Feat: Handle Adjoints through Initialization #1168

DhairyaLGandhi · 2025-03-06T12:28:09Z

Checklist

Appropriate tests were added
Any code changes were done in a way that does not break public API
All documentation related to code changes were updated
The new code follows the
contributor guidelines, in particular the SciML Style Guide and
COLPRAC.
Any new documentation only uses public API

Additional context

MTK and SciML construct an initialization problem before starting the time stepping to ensure the starting values of the unknowns and parameters adhere to any constraints needed for the system. This PR adds handling for adjoint sensitivities of the NonlinearProblem, NonlinearSquaresProblem, SCCNonlinearProblem etc.

I am opening this to get some feedback regarding how we can accumulate gradients correctly. I have also included a test case for a DAE which I will update to use the values out of SciMLSensitivity.

Add any other context about the problem here.

Currently the gradients get calculated but don't get accumulated, we need to be able to update the gradients for the parameters. Since this is a manual dispatch, the usual graph building in AD is bypassed, and we need to handle this manually. Ideally, we should make it so the cfg itself includes the initialization so we would not have gotten incorrect gradients in the first place 😅 We are also forced to use a LinearProblem instead of \ because it cannot handle singular jacobians.

cc @ChrisRackauckas

DhairyaLGandhi · 2025-03-06T12:44:31Z

I wanted to ask whether it is preferred to retain \ for the smaller problems, which would be slower with LinearSolve, and if so, we will need to deal with LAPACK errors.

ChrisRackauckas · 2025-03-08T12:45:46Z

I wanted to ask whether it is preferred to retain \ for the smaller problems, which would be slower with LinearSolve, and if so, we will need to deal with LAPACK errors.

LinearSolve.jl should be faster across the board? It depends a bit on the CPU architecture since it depends on whether it guesses the right LU correctly,

ChrisRackauckas · 2025-03-08T12:46:50Z

Note that with the latest MTK update, there is now an Initials section where the initial u0 live. That should fix a few things.

ChrisRackauckas · 2025-03-08T12:48:56Z

src/concrete_solve.jl

@@ -425,6 +425,21 @@ function DiffEqBase._concrete_solve_adjoint(
            save_end = true, kwargs_fwd...)
    end

+    # Get gradients for the initialization problem if it exists
+    igs = if _prob.f.initialization_data.initializeprob != nothing


this should be before the solve, since you can use the initialization solution from here in the remakes of 397-405 in order to set new u0 and p and thus skip running the initialization a second time.

How can I indicate to solve to avoid running initialization?

initializealg = NoInit(). Should probably just do CheckInit() for safety but either is fine.

ChrisRackauckas · 2025-03-08T12:54:38Z

src/steadystate_adjoint.jl

@@ -103,15 +102,18 @@ end
    else
        if linsolve === nothing && isempty(sensealg.linsolve_kwargs)
            # For the default case use `\` to avoid any form of unnecessary cache allocation


Yeah I don't know about that comment. I think it's just old. (a) \ always allocates because it uses lu instead of lu!, so it's re-allocating the while matrix which is larger than any LinearSolve allocation, and (b) we have since 2023 setup tests on StaticArrays, so the immutable path is non-allocating. I don't think (b) was true when this was written.

So glad we can remove this branch altogether.

ChrisRackauckas · 2025-03-08T12:56:27Z

src/concrete_solve.jl

+        iprob = _prob.f.initialization_data.initializeprob
+        ip = parameter_values(iprob)
+        itunables, irepack, ialiases = canonicalize(Tunable(), ip)
+        igs, = Zygote.gradient(ip) do ip


This gradient isn't used? I think this would go into the backpass and if I'm thinking clearly, the resulting return is dp .* igs?

Not yet. These gradients are currently against the parameters of the initialization problem, not the system exactly. And the mapping between the two is ill defined, so we cannot simply accum

I spoke with @AayushSabharwal about a way to map, it seems initialization_data.intializeprobmap might have some support to return the correctly shaped vector, but there are cases where we cannot know the ordering of dp either.

There's another subtlety. I am not sure we haven't missed some part of the cfg by manually handling accumulation of gradients. Or any transforms we might need to calculate gradients for. The regular AD graph building typically took care of these details for us, but in this case we would need to worry about incorrect gradients manually

Oh yes, you need to use the initializeprobmap https://github.com/SciML/SciMLBase.jl/blob/master/src/initialization.jl#L268 to map it back to the shape of the initial parameters.

but there are cases where we cannot know the ordering of dp either.

p and dp just need the same ordering, so initializeprobmap should do the trick.

There's another subtlety. I am not sure we haven't missed some part of the cfg by manually handling accumulation of gradients. Or any transforms we might need to calculate gradients for. The regular AD graph building typically took care of these details for us, but in this case we would need to worry about incorrect gradients manually

This is the only change to (u0,p) before solving, so this would account for it, given initializeprobmap is just an index map so an identity function.

Addressed this occurance in 95ebbf3 to check if this is correct. Will need to work around the global call

DhairyaLGandhi · 2025-03-10T14:10:19Z

Trying to use the initialization end to end caused gradients against parameters to get dropped. https://github.com/DhairyaLGandhi/SciMLBase.jl/tree/dg/nonlinear is a WIP branch which adds adjoints to the getindex calls, which does capture the expected gradients, but we still end up dropping gradients somewhere in the chain. I am looking into whether we are doing so in the custom adjoints, because that was an issue I had identified for the ODE case.

…tion problem

ChrisRackauckas · 2025-03-12T16:21:36Z

src/concrete_solve.jl

+            new_u0, new_p, _ = SciMLBase.get_initial_values(new_prob, new_prob, new_prob.f, SciMLBase.OverrideInit(), Val(true);
+                                                            abstol = 1e-6,
+                                                            reltol = 1e-6,
+                                                            sensealg = SteadyStateAdjoint(autojacvec = ZygoteVJP()))


shouldn't default to ZygoteVJP. Should use the autojacvec of the ODE

Addressed in 9a8a845

DhairyaLGandhi · 2025-03-13T10:57:16Z

I could use some understanding of how to handle initialization when MTK analytically solves the problem, and removes all the unknowns. In that case u0, u etc are empty, and we attempt to calculate the jacobian for it in dgdgu_val which seems weird, there are no unknowns to calculate the jacobian for. It would be equally incorrect to use the unknowns from the system itself. The best way to handle this might be to simply add a check for isempty and return. Is that reasonable?

test/mtk.jl

DhairyaLGandhi · 2025-04-23T19:14:19Z

There seems to be an InitialFailure with DefaultInit

DhairyaLGandhi · 2025-04-23T19:23:09Z

julia> prob, ps, init = setups[3];

julia> init
OrdinaryDiffEqCore.DefaultInit()

julia> new_sol = solve(prob, Rodas5P(); p = p, sensealg, initializealg = init, abstol = 1e-6, reltol = 1e-3)
┌ Warning: Initialization system is overdetermined. 1 equations for 0 unknowns. Initialization will default to using least squares. `SCCNonlinearProblem` can only be used for initialization of fully determined systems and hence will not be used here. To suppress this warning pass warn_initialize_determined = false. To make this warning into an error, pass fully_determined = true
└ @ ModelingToolkit ~/.julia/packages/ModelingToolkit/aau6A/src/systems/diffeqs/abstractodesystem.jl:1520
retcode: InitialFailure
Interpolation: specialized 4rd order "free" stiffness-aware interpolation
t: 1-element Vector{Float64
}:
 0.0
u: 1-element Vector{Vector{Float64}}:
 [2.0, 0.0, 0.0, 1.0, 0.0]

ChrisRackauckas · 2025-04-23T21:02:04Z

test/mtk.jl

+    w2 => -1.0,]
+prob_correctu0 = ODEProblem(sys, u0_correct, tspan, p, jac = true, guesses = [w2 => -1.0])
+mtkparams_correctu0 = SciMLSensitivity.parameter_values(prob_correctu0)
+prob_correctu0.u0[5] = -1.0


@AayushSabharwal note that this is the issue of not passing over the initial values and guess values to u0. Without this CheckInit will fail. It's a minor issue but it is something we should handle over time.

Yeah I'm working on using the hook in DiffEqBase. It's easy enough to implement, but somehow the returned problem's u0 isn't being respected

…to dg/initprob

test/mtk.jl

Co-authored-by: Christopher Rackauckas <[email protected]>

…to dg/initprob

ChrisRackauckas · 2025-04-24T12:22:33Z

src/adjoint_common.jl

+# if !hasmethod(Zygote.adjoint,
+#     Tuple{Zygote.AContext, typeof(Zygote.literal_getproperty),
+#         SciMLBase.AbstractTimeseriesSolution, Val{:u}})
+#     Zygote.@adjoint function Zygote.literal_getproperty(sol::AbstractTimeseriesSolution,
+#             ::Val{:u})
+#         function solu_adjoint(Δ)
+#             zerou = zero(sol.prob.u0)
+#             _Δ = @. ifelse(Δ === nothing, (zerou,), Δ)
+#             (SciMLBase.build_solution(sol.prob, sol.alg, sol.t, _Δ),)
+#         end
+#         sol.u, solu_adjoint
+#     end
+# end


I made a mistake pushing cdaa2c7 so have reverted it

ChrisRackauckas · 2025-04-24T13:39:10Z

Project.toml

@@ -1,7 +1,7 @@
 name = "SciMLSensitivity"
 uuid = "1ed8b502-d754-442c-8d5d-10ac956f44a1"
 authors = ["Christopher Rackauckas <[email protected]>", "Yingbo Ma <[email protected]>"]
-version = "7.77.0"
+version = "7.76.0"


Ah this was picked from the history, updated. Thanks

DhairyaLGandhi added 6 commits March 6, 2025 15:22

feat(initialization): get gradients against initialization problem

d6290bf

test: add initialization problem to test suite

d3199c0

chore: pass tunables to jacobian

a4fa7c5

test: cleanup imports

5a7dd26

chore: rm debug statements

94ec324

test: add Core8 to CI

0c5564e

ChrisRackauckas reviewed Mar 8, 2025

View reviewed changes

DhairyaLGandhi added 3 commits March 11, 2025 16:13

chore: use get_initial_values

aca9cd4

chore: check for OVERDETERMINED initialization for solving initializa…

9bec784

…tion problem

chore: pass sensealg to initial_values

72cbb35

ChrisRackauckas reviewed Mar 12, 2025

View reviewed changes

DhairyaLGandhi added 3 commits March 13, 2025 15:17

chore: treat Delta as array

2d85e19

chore: use autojacvec from sensealg

9a8a845

chore: move igs before solve, re-use initialization

95ebbf3

DhairyaLGandhi added 2 commits March 13, 2025 16:53

chore: update igs to re-use inital values

957d7fe

chore: qualify NoInit

a00574f

DhairyaLGandhi mentioned this pull request Mar 13, 2025

get_initial_values does not forward kwargs properly SciML/SciMLBase.jl#951

Closed

DhairyaLGandhi added 6 commits March 17, 2025 16:57

chore: remove igs from steady state adjoint gor initialization

4562f0c

chore: accumulate gradients in steady state adjoint explicitly

6c21324

fix: handle MTKparameters and Arrays uniformly

a675a7f

feat: allow reverse mode for initialization solving

7941a3c

test: add more tests for parameter initialization

9557e8c

test: fix label

8feae0e

DhairyaLGandhi and others added 2 commits April 23, 2025 20:31

test: allocate gt based on size of new_sol

a7d4e5a

Update Project.toml

8e660fe

ChrisRackauckas reviewed Apr 23, 2025

View reviewed changes

test/mtk.jl Outdated Show resolved Hide resolved

Update test/mtk.jl

de63cf9

Update mtk.jl

b88f468

ChrisRackauckas reviewed Apr 23, 2025

View reviewed changes

ChrisRackauckas and others added 3 commits April 23, 2025 19:24

Also test u0 gradients

7dd1cc7

chore: merge upstream

cdaa2c7

Merge branch 'dg/initprob' of github.com:SciML/SciMLSensitivity.jl in…

d3608c4

…to dg/initprob

ChrisRackauckas reviewed Apr 24, 2025

View reviewed changes

test/mtk.jl Outdated Show resolved Hide resolved

DhairyaLGandhi and others added 3 commits April 24, 2025 15:29

Update test/mtk.jl

4cf7bd5

Co-authored-by: Christopher Rackauckas <[email protected]>

chore: handle when p is a functor in steady state adjoint

2ae712b

Merge branch 'dg/initprob' of github.com:SciML/SciMLSensitivity.jl in…

229e691

…to dg/initprob

ChrisRackauckas reviewed Apr 24, 2025

View reviewed changes

DhairyaLGandhi added 4 commits April 24, 2025 17:56

Merge branch 'master' into dg/initprob

6e1109e

chore: git mixup

d32b3f2

chore: git mixup

6e549e7

chore: revert bad commit

9789034

ChrisRackauckas reviewed Apr 24, 2025

View reviewed changes

DhairyaLGandhi added 5 commits April 24, 2025 19:19

chore: handle nothing dtunables for SteadyStateAdjoint

35937c0

chore: rm u0 nothing forced to empty array

f934635

chore: reverse order of nothing check

a83cd29

chore: rm dead code

2dfbc6e

chore: DEQ handling

9aecbfd

ChrisRackauckas approved these changes Apr 24, 2025

View reviewed changes

DhairyaLGandhi merged commit 5f6ab43 into master Apr 24, 2025
14 of 18 checks passed

ChrisRackauckas deleted the dg/initprob branch April 24, 2025 22:58

DhairyaLGandhi mentioned this pull request Apr 27, 2025

Fix gradients for parameters getting dropped #1180

Closed

5 tasks

Uh oh!

Feat: Handle Adjoints through Initialization #1168

Feat: Handle Adjoints through Initialization #1168

Uh oh!

Conversation

DhairyaLGandhi commented Mar 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Checklist

Additional context

Uh oh!

DhairyaLGandhi commented Mar 6, 2025

Uh oh!

ChrisRackauckas commented Mar 8, 2025

Uh oh!

ChrisRackauckas commented Mar 8, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

DhairyaLGandhi commented Mar 10, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

DhairyaLGandhi commented Mar 13, 2025

Uh oh!

Uh oh!

DhairyaLGandhi commented Apr 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DhairyaLGandhi commented Apr 23, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

DhairyaLGandhi commented Mar 6, 2025 •

edited

Loading

DhairyaLGandhi commented Apr 23, 2025 •

edited

Loading